



The Chinese University of Hong Kong

# CSCI2510 Computer Organization Lecture 06: Memory Hierarchy

#### Ming-Chang YANG

mcyang@cse.cuhk.edu.hk

Reading: Chap. 8.1~8.5, Appendix A.5.1

## Basic Functional Units of a Computer



- **Input**: accepts <u>coded information</u> from human operators.
- **Memory**: stores the <u>received information</u> for later use.
- **Processor**: executes the <u>instructions</u> of a <u>program</u> stored in the <u>memory</u>.
- **Output**: reacts to the outside world.
- **Control**: coordinates all these actions.

CSCI2510 Lec06: Memory Hierarchy 2022-23 T1

#### Outline



- An Overview of Memory
- Memory Technologies
  - Random Access Memory (RAM)
  - Read-Only Memory (ROM)
  - Non-Volatile Memory (NVM)
- Solution: Memory Hierarchy



https://www.magneticmemorymethod.com/brain-exercises/

## **Revisit: Why Do We Need Memory?**



- Reason: Programs and their data must be held in the computer's memory in order to be executed.

Windows Task Manager (Processes tab), showing 8 apps in use:

| Name                        | CPU  | Memory   | Disk     | Network  |
|-----------------------------|------|----------|----------|----------|
| *Total utilization*         | 2%   | 35%      | 0%       | 0%       |
| Adobe Acrobat (32 bit)      | 0%   | 62.4 MB  | 0 MB/s   | 0 Mbps   |
| Google Chrome (2)           | 0%   | 147.2 MB | 0.1 MB/s | 0.1 Mbps |
| Instant Dictionary (32 bit) | 0.2% | 21.1 MB  | 0 MB/s   | 0 Mbps   |
| Microsoft PowerPoint        | 0%   | 282.1 MB | 0 MB/s   | 0 Mbps   |
| Skype (32 bit)              | 0.1% | 67.0 MB  | 0.1 MB/s | 0 Mbps   |
| Snipping Tool               | 0.5% | 4.6 MB   | 0 MB/s   | 0 Mbps   |
| Task Manager                | 0.5% | 14.6 MB  | 0 MB/s   | 0 Mbps   |
| Windows Explorer            | 0.4% | 60.3 MB  | 0 MB/s   | 0 Mbps   |

## **Revisit: Memory Basics**

- Most machines are byte-addressable.
  - Each memory address location refers to a byte (B).
- Memory is designed to store/retrieve in words.
  - A word is usually of 16, 32 or 64 bits.
  - Reason? Performance consideration.

- The maximum size of memory that can be addressed is determined by the addressing capability.
  - For example, a 32-bit machine (that uses 32-bit addresses) can utilize a memory that contains <u>up to</u>  $2^{32}$  bytes = 4GB.
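The addressing limit can be checked with a few lines of Python. This is a minimal sketch, assuming byte-addressable memory and binary units (1 KB = 1024 B); the function names are illustrative and not part of the lecture.

```python
def max_addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def human_readable(num_bytes: int) -> str:
    """Format a power-of-two byte count using binary units (1 KB = 1024 B)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]
    value = num_bytes
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value} {unit}"
        value //= 1024


if __name__ == "__main__":
    for bits in (16, 32, 64):
        print(f"{bits}-bit addresses: {human_readable(max_addressable_bytes(bits))}")
```

With 32-bit addresses this reports 4 GB, matching the example above. The actual memory installed can of course be smaller than this upper bound.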





## **Simplified View: Processor-Memory**



- Data transferring takes place through MAR and MDR.
  - MAR: Memory Address Register
  - MDR: Memory Data Register



\*MFC (Memory Function Completed): Indicating the requested operation has been completed.
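The role of MAR and MDR can be illustrated with a toy model. The sketch below assumes a memory of 2^k words with n bits each; the class names, the method names, and the stored values are hypothetical and chosen only for illustration. Returning from `read` or `write` stands in for the MFC signal.

```python
class SimpleMemory:
    """Toy memory with 2**k words of n bits each (illustrative only)."""

    def __init__(self, k: int, n: int):
        self.k = k                      # address bus width in bits
        self.n = n                      # data bus width in bits
        self.cells = [0] * (2 ** k)     # all words start at zero


class SimpleProcessor:
    """Holds the two registers that sit between processor and memory."""

    def __init__(self, memory: SimpleMemory):
        self.memory = memory
        self.mar = 0                    # Memory Address Register
        self.mdr = 0                    # Memory Data Register

    def _check(self, address: int) -> None:
        if not 0 <= address < len(self.memory.cells):
            raise ValueError("address does not fit on the address bus")

    def read(self, address: int) -> int:
        self._check(address)
        self.mar = address                          # 1. address goes into MAR
        self.mdr = self.memory.cells[self.mar]      # 2. memory puts the word into MDR
        return self.mdr                             # 3. done (MFC)

    def write(self, address: int, value: int) -> None:
        self._check(address)
        self.mar = address                                  # 1. address goes into MAR
        self.mdr = value & ((1 << self.memory.n) - 1)       # 2. data goes into MDR (n bits)
        self.memory.cells[self.mar] = self.mdr              # 3. memory stores the word (MFC)


if __name__ == "__main__":
    cpu = SimpleProcessor(SimpleMemory(k=3, n=4))
    cpu.write(0b101, 0b1010)
    print(f"MAR={cpu.mar:03b} MDR={cpu.mdr:04b}")
    print(f"read back: {cpu.read(0b101):04b}")
```

A read changes MAR and MDR but leaves the memory untouched, while a write changes MAR, MDR, and the addressed word.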

## **Class Exercise 6.1**

| Student ID: | Date: |
|-------------|-------|
| Name:       |       |

- Assume 3-bit address bus (i.e. k=3) and 4-bit data bus (i.e. n=4) are used.
- What will be the contents of MAR, MDR, and the memory after a read or write operation is performed?



## **Memory Cell Organization**

- Memory cells are usually organized as an **array**:
  - Each cell can store one bit of information, and
  - Each row of cells constitutes a memory word.
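The array organization fixes three quantities: total capacity, address width, and data width. Below is a minimal sketch of that arithmetic, assuming one address per word; the 128-word by 16-bit array in the usage example is a hypothetical size, not one from the slides.

```python
def memory_array_stats(num_words: int, bits_per_word: int):
    """Return (total_bits, address_bits, data_bits) for a cell array.

    Each row is one word, so the array has num_words rows and
    bits_per_word columns of one-bit cells.
    """
    if num_words < 1 or bits_per_word < 1:
        raise ValueError("array dimensions must be positive")
    total_bits = num_words * bits_per_word
    # Smallest number of address bits that can select every row.
    address_bits = (num_words - 1).bit_length()
    data_bits = bits_per_word
    return total_bits, address_bits, data_bits


if __name__ == "__main__":
    total, addr, data = memory_array_stats(num_words=128, bits_per_word=16)
    print(f"capacity: {total} bits, address bus: {addr} bits, data bus: {data} bits")
```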




## **Class Exercise 6.2**



- In the example, the small memory circuit contains 16 words, and each word has 8 bits.
- How many bits of data can be stored in this memory?
- Answer:
- How many address bus bits do we need?
- Answer:
- How many data bus bits do we need?
- Answer:
- Is there any control line in the example?
- Answer:


## **Mainstream Memory Technologies**



- There are many types of memory on the market:



https://thememoryguy.com/category/other-current-memory-technologies/


### Random Access Memory (RAM)



- Random Access Memory (RAM): The access (R/W) time to any location is the same, independent of the location's address.
  - Memory Access Time: The time between start and finish of a memory request.
  - That is, we can "randomly" access any location of the RAM with the same memory access time.
- RAM is available in a wide range of types:
  - 1) Static RAM (SRAM)
  - 2) Dynamic RAM (DRAM)
  - 3) Synchronous DRAM (SDRAM)

## Static RAM (SRAM)

- Static RAM (SRAM): Capable of "statically" retaining the cell state (i.e., data) as long as power is applied (i.e., volatile).
  - **<u>Fast</u>**: Access times are on the order of a few nanoseconds.
  - **<u>Low power</u>**:
    - In **SRAM**, continuous power is needed to retain its state; otherwise, the contents are lost.
    - CMOS SRAM has very low power consumption: current flows only when the cells are accessed.






Two inverters are cross-connected to form a latch, which is connected to the bit lines through two transistors.

## **CMOS Static RAM**



Two pairs of transistors form the inverters in the latch instead (see Appendix A.5.1).



## Dynamic RAM (DRAM)

- **Dynamic RAM (DRAM)**: Stores data in the form of "dynamic" charge on a capacitor.





- Why is a DRAM cell "dynamic"?
  - The charge can be maintained for only tens of milliseconds.
  - That is, the charge leaks away over time (i.e., volatile).
- The contents of DRAM cells must be refreshed periodically.
  - By recharging the capacitor.
  - $\rightarrow$  A DRAM cell consumes <u>more power</u> than an SRAM cell.

## Synchronous DRAM (SDRAM) (1/3)



- Synchronous DRAM (SDRAM): Uses the same cells as DRAM but uses a **clock** to synchronize operations.
  - Why? The refresh operation can be transparent to users.



- Double Data Rate (DDR) SDRAM: Transfers data on both clock edges.
  - The most common type used today as the main memory.

## Synchronous DRAM (SDRAM) (2/3)



• **Memory Modules**: The standard for today's computers to hold multiple SDRAM chips.

- **SO-DIMM** (for laptops): <u>S</u>mall <u>O</u>utline <u>D</u>ual <u>I</u>n-line <u>M</u>emory <u>M</u>odule
- **DIMM** (for desktops): <u>D</u>ual <u>I</u>n-line <u>M</u>emory <u>M</u>odule



## Synchronous DRAM (SDRAM) (3/3)



- Enhanced Versions: DDR-2, DDR-3, and DDR-4
  - They offer larger size, lower power and faster clock rates.
- The table below compares the theoretical maximum bandwidth of different SDRAM types.

| RAM Type                       | Theoretical Maximum Bandwidth                  |
|--------------------------------|------------------------------------------------|
| SDRAM 100 MHz (PC100)          | 100 MHz × 64 bit/cycle = 800 MByte/sec         |
| SDRAM 133 MHz (PC133)          | 133 MHz × 64 bit/cycle = 1064 MByte/sec        |
| DDR SDRAM 200 MHz (PC1600)     | 2 × 100 MHz × 64 bit/cycle ≈ 1600 MByte/sec    |
| DDR SDRAM 266 MHz (PC2100)     | 2 × 133 MHz × 64 bit/cycle ≈ 2100 MByte/sec    |
| DDR SDRAM 333 MHz (PC2700)     | 2 × 166 MHz × 64 bit/cycle ≈ 2700 MByte/sec    |
| DDR-2 SDRAM 667 MHz (PC2-5400) | 2 × 2 × 166 MHz × 64 bit/cycle ≈ 5400 MByte/sec |
| DDR-2 SDRAM 800 MHz (PC2-6400) | 2 × 2 × 200 MHz × 64 bit/cycle ≈ 6400 MByte/sec |

- In practice, SDRAM does not perform as well as the table suggests, due to latency.
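The arithmetic behind the table is clock rate × transfers per clock × bus width. The following is a minimal sketch of that formula, assuming a 64-bit data bus as in the table; the function and parameter names are illustrative.

```python
def peak_bandwidth_mbytes(base_clock_mhz: int,
                          transfers_per_clock: int = 1,
                          bus_width_bits: int = 64) -> int:
    """Theoretical peak bandwidth in MByte/sec.

    base_clock_mhz      -- base clock in MHz (e.g. 100 for PC100 and PC1600)
    transfers_per_clock -- 1 for SDRAM, 2 for DDR, 2 x 2 = 4 for DDR-2
    bus_width_bits      -- width of the data bus (64 bits in the table)
    """
    # MHz x bits gives Mbit/sec; dividing by 8 converts to MByte/sec.
    return base_clock_mhz * transfers_per_clock * bus_width_bits // 8


if __name__ == "__main__":
    print("SDRAM PC100:  ", peak_bandwidth_mbytes(100))
    print("DDR   PC1600: ", peak_bandwidth_mbytes(100, transfers_per_clock=2))
    print("DDR-2 PC2-6400:", peak_bandwidth_mbytes(200, transfers_per_clock=4))
```

Module names such as PC2100 round the computed figure (2 × 133 × 8 = 2128 MByte/sec) to a convenient number, which is why several rows in the table are approximate.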

## Bandwidth vs. Latency

- **Bandwidth**: *The maximal number of bits or bytes that can be transferred in one second*.
- **Latency**: *The amount of time it takes to transfer the first word after issuing an access (i.e., access strobe).*
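The two definitions can be turned into a small calculation. This sketch assumes a simple timing model that is not taken from the slides: the first word completes after a fixed number of clock cycles, and one further word is transferred in every cycle after that. The clock rate, word size, and cycle counts in the usage example are hypothetical.

```python
def latency_ns(clock_mhz: float, cycles_until_first_word: int) -> float:
    """Time from issuing the access until the first word is transferred."""
    cycle_time_ns = 1000 / clock_mhz
    return cycles_until_first_word * cycle_time_ns


def peak_bandwidth_mbytes(clock_mhz: float, word_bits: int) -> float:
    """Peak transfer rate once words are streaming, one word per cycle."""
    return clock_mhz * word_bits / 8


def effective_bandwidth_mbytes(clock_mhz: float, word_bits: int,
                               cycles_until_first_word: int,
                               words_in_burst: int) -> float:
    """Average rate over a whole burst, including the initial latency."""
    total_cycles = cycles_until_first_word + (words_in_burst - 1)
    total_time_ns = total_cycles * 1000 / clock_mhz
    total_bytes = words_in_burst * word_bits / 8
    # bytes per nanosecond x 1000 = MByte/sec
    return total_bytes * 1000 / total_time_ns


if __name__ == "__main__":
    # Hypothetical numbers: 100 MHz clock, 32-bit words, 4 cycles to first word.
    print("latency (ns):      ", latency_ns(100, 4))
    print("peak (MByte/sec):  ", peak_bandwidth_mbytes(100, 32))
    print("4-word burst (MByte/sec):", effective_bandwidth_mbytes(100, 32, 4, 4))
```

Under these assumptions a short burst achieves well below the peak rate, which is the gap between theoretical bandwidth and observed performance noted earlier.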



## **Class Exercise 6.3**



- Suppose the clock rate is 500 MHz, and each word (i.e., w1, w2, w3) is 16 bits in the previous example. What are the bandwidth and latency when transferring data via the SDRAM?
- Answer:


## Read-Only Memory (ROM) (1/2)



- All types of RAM cells are programmable but *volatile*.
  - Volatile: The data is kept only while power is turned on.
- Read-Only Memory (ROM): Information can be written into it only once, but it is non-volatile.
  - Some ROM variants allow data to be programmed/erased:
    - **Programmable** ROM (PROM):
      - Allows the data to be loaded by the user, irreversibly (write once!).
    - **Erasable** Programmable ROM (EPROM):
      - Allows the stored data to be erased and new data to be written into it.
      - Provides flexibility for the development of digital systems.
      - Must be physically removed from the circuit for reprogramming, and the stored data cannot be erased selectively.
    - **Electrically** Erasable Programmable ROM (EEPROM):
      - Can be erased and reprogrammed electrically, without being removed from the circuit.
      - Needs different voltages for erasing/writing/reading, which increases complexity.

## Read-Only Memory (ROM) (2/2)



- ROM is useful for storing software (or firmware) that is *rarely changed* during the life of the system.
  - E.g., **BIOS**: a program to "bootstrap" a computer
  - E.g., plug-in cartridges for video games





- Nevertheless, ROM is much slower than RAM.

http://www.c-jump.com/CIS24/Slides/Booting/Booting.html https://en.wikipedia.org/wiki/Read-only\_memory


## Non-Volatile Memory (NVM)



- Non-Volatile Memory (NVM): A new approach similar to EEPROM technology
  - NVM can be read, written, and erased, and it's non-volatile.
  - Features: greater density, higher capacity and lower cost, lower power, shock resistant, but still slower than RAM.
  - The most famous/successful example: Flash memory.



There are many other types of NVM: PCRAM (or 3D XPoint), ReRAM, STT-RAM, Racetrack Memory, etc.

## NAND Flash Memory (1/2)





## NAND Flash Memory (2/2)



- The controller of a flash memory device is complex.
  - It must perform a myriad of tasks to receive, monitor and deliver data efficiently and reliably.



## 3D XPoint (1/2)



- Intel® Optane<sup>™</sup> DC persistent memory is the latest, innovative memory technology.
  - It offers large, affordable capacity together with data persistence.
    - 10X higher density than DRAM.
  - It adopts 3D XPoint as the memory media.



https://www.intel.com/content/www/us/en/architecture-and-technology/optane-dc-persistent-memory.html https://www.slideshare.net/Syntech/intel-micron-unveil-breakthrough-3d-xpoint-memory-tech-a-revolutionary-breakthrough-in-memory-technology

## 3D XPoint (2/2)





Technology claims are based on comparisons of latency, density and write cycling metrics amongst memory technologies recorded on published specifications of in-market memory products against internal Intel specifications.




https://arstechnica.com/gadgets/2017/04/intel-optane-memory-how-to-make-revolutionary-technology-totally-boring/

## **Revisit: Memory Technologies**

What is the "best" choice for the computer memory?



https://thememoryguy.com/category/other-current-memory-technologies/


## **Mix-and-Match: Best of ALL**



- An ideal memory would be <u>fast</u>, large, and cheap.
- In reality, each memory technology has its pros and cons.
- ① **SRAM** is fast, but expensive and not very dense:
  - Good choice for providing the user the fastest access
  - Good for registers, L1 and L2 cache in the processor
- ② SDRAM is slower, but cheap and dense:
  - Good choice for providing the user a big memory space
  - Good for main memory (*volatile*)
- ③ NVM/SSD/Disk are even slower, but even cheaper, denser and non-volatile:
  - Good choice for cost-effective data storage
  - Good for secondary storage (non-volatile)

## Solution: Memory Hierarchy (1/2)





## Solution: Memory Hierarchy (2/2)



- Provide the user with as much memory as is available in the cheapest memory technology.
- Provide access at the speed offered by the fastest memory technology.
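One way to see why a hierarchy can approach the speed of its fastest level is to average the access time over many requests. The sketch below uses a simple two-level model that is an assumption for illustration, not part of the slides: a fraction of requests is served by the fast memory and the rest by the slow one. The access times and the fraction in the usage example are hypothetical.

```python
def average_access_time_ns(fast_fraction: float,
                           fast_ns: float,
                           slow_ns: float) -> float:
    """Average access time when fast_fraction of requests hit the fast level.

    fast_fraction -- share of requests served by the fast memory (0.0 to 1.0)
    fast_ns       -- access time of the fast, small memory
    slow_ns       -- access time of the slow, large memory
    """
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must be between 0 and 1")
    return fast_fraction * fast_ns + (1.0 - fast_fraction) * slow_ns


if __name__ == "__main__":
    # Hypothetical figures: 2 ns fast memory, 50 ns slow memory.
    for fraction in (0.0, 0.5, 0.95, 1.0):
        t = average_access_time_ns(fraction, fast_ns=2.0, slow_ns=50.0)
        print(f"{fraction:.0%} served by fast memory -> {t:.1f} ns on average")
```

Under this model, the more requests the fast level can serve, the closer the average gets to the fast memory's access time, while the total capacity is still set by the large, cheap level.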



https://electronics.stackexchange.com/questions/259857/hierarchical-memory-organization

## Summary



- An Overview of Memory
- Memory Technologies
  - Random Access Memory (RAM)
  - Read-Only Memory (ROM)
  - Non-Volatile Memory (NVM)
- Solution: Memory Hierarchy

